collaborative perception
Where2comm: Communication-Efficient Collaborative Perception via Spatial Confidence Maps
In the simulation, we consider a UAV swarm flying over diverse simulated scenes at various altitudes. Each UAV has a sensing device to collect RGB images, a computation device to perceive the environment with a perception model, and a communication device to transmit perception information among UAVs. In this setting, the UAV swarm can perform 2D/3D object detection and pixel-wise or bird's-eye-view (BEV) semantic segmentation in a collaborative manner.
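The per-UAV setup above (a camera, an onboard perception model, and a radio) can be sketched as a minimal agent loop with naive late fusion; every class, function, and field name here is illustrative, not taken from any released codebase:

```python
from dataclasses import dataclass, field

@dataclass
class UAVAgent:
    """One swarm member: sensing, computation, and communication roles.
    All names here are assumptions for illustration only."""
    agent_id: int
    features: dict = field(default_factory=dict)  # per-frame perceived features

    def sense(self, frame_id, rgb_image):
        # Stand-in for the perception model: reduce the image to one feature.
        self.features[frame_id] = sum(rgb_image) / len(rgb_image)

    def broadcast(self, frame_id):
        # Transmit the compact perception message, not the raw pixels.
        return {"from": self.agent_id, "frame": frame_id,
                "feature": self.features[frame_id]}

def fuse(messages):
    # Naive late fusion: average the received features for a frame.
    return sum(m["feature"] for m in messages) / len(messages)

swarm = [UAVAgent(i) for i in range(3)]
for uav in swarm:
    uav.sense(0, [uav.agent_id + 1.0] * 4)  # fake RGB payload per UAV
fused = fuse([uav.broadcast(0) for uav in swarm])
print(fused)  # average of features 1.0, 2.0, 3.0 -> 2.0
```

Real systems would exchange learned feature maps or detections rather than scalars, but the split into sense / broadcast / fuse mirrors the sensing, computation, and communication devices described above.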
Vision-Only Gaussian Splatting for Collaborative Semantic Occupancy Prediction
Chen, Cheng, Huang, Hao, Bagchi, Saurabh
Collaborative perception enables connected vehicles to share information, overcoming occlusions and extending the limited sensing range inherent in single-agent (non-collaborative) systems. Existing vision-only methods for 3D semantic occupancy prediction commonly rely on dense 3D voxels, which incur high communication costs, or 2D planar features, which require accurate depth estimation or additional supervision, limiting their applicability to collaborative scenarios. To address these challenges, we propose the first approach leveraging sparse 3D semantic Gaussian splatting for collaborative 3D semantic occupancy prediction. By sharing and fusing intermediate Gaussian primitives, our method provides three benefits: a neighborhood-based cross-agent fusion that removes duplicates and suppresses noisy or inconsistent Gaussians; a joint encoding of geometry and semantics in each primitive, which reduces reliance on depth supervision and allows simple rigid alignment; and sparse, object-centric messages that preserve structural information while reducing communication volume. Extensive experiments demonstrate that our approach outperforms single-agent perception and baseline collaborative methods by +8.42 and +3.28 points in mIoU, and +5.11 and +22.41 points in IoU, respectively. When further reducing the number of transmitted Gaussians, our method still achieves a +1.9 improvement in mIoU while using only 34.6% of the communication volume, highlighting robust performance under limited communication budgets.
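The neighborhood-based cross-agent fusion described above can be sketched as a toy deduplication over shared Gaussian primitives; the dict layout, the `radius` threshold, and the opacity tie-break are assumptions for illustration, not the paper's actual fusion rule:

```python
import math

def dedup_gaussians(primitives, radius=0.5):
    """Toy neighborhood-based cross-agent fusion (illustrative only):
    a Gaussian whose center lies within `radius` of an already-kept
    primitive is treated as a duplicate; among duplicates, the one
    with the higher opacity survives."""
    kept = []
    for g in sorted(primitives, key=lambda g: -g["opacity"]):
        if all(math.dist(g["mean"], k["mean"]) > radius for k in kept):
            kept.append(g)
    return kept

# Two agents observe the same object; near-identical Gaussians collapse to one.
agent_a = [{"mean": (0.0, 0.0, 0.0), "opacity": 0.9, "sem": "car"}]
agent_b = [{"mean": (0.1, 0.0, 0.0), "opacity": 0.6, "sem": "car"},
           {"mean": (5.0, 1.0, 0.0), "opacity": 0.8, "sem": "tree"}]
fused = dedup_gaussians(agent_a + agent_b)
print(len(fused))  # 2: the duplicate car Gaussian is suppressed
```

Because each primitive carries geometry and semantics jointly, cross-agent alignment before this step reduces to a rigid transform of the `mean` (and covariance) fields, which is the simplicity the abstract highlights.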
U2UData+: A Scalable Swarm UAVs Autonomous Flight Dataset for Embodied Long-horizon Tasks
Feng, Tongtong, Wang, Xin, Han, Feilin, Zhang, Leping, Zhu, Wenwu
Swarm UAV autonomous flight for Embodied Long-Horizon (ELH) tasks is crucial for advancing the low-altitude economy. However, existing methods focus only on specific basic tasks due to dataset limitations and fail in real-world deployment for ELH tasks. ELH tasks are not mere concatenations of basic tasks: they require handling long-term dependencies, maintaining embodied persistent states, and adapting to dynamic goal shifts. This paper presents U2UData+, the first large-scale swarm UAV autonomous flight dataset for ELH tasks and the first scalable swarm UAV platform for online data collection and closed-loop algorithm verification. The dataset is captured by 15 UAVs in autonomous collaborative flights for ELH tasks, comprising 12 scenes, 720 traces, 120 hours, 600 seconds per trajectory, 4.32M LiDAR frames, and 12.96M RGB frames. It also includes brightness, temperature, humidity, smoke, and airflow values covering all flight routes. The platform supports the customization of simulators, UAVs, sensors, flight algorithms, formation modes, and ELH tasks. Through a visual control window, the platform allows users to collect customized datasets via one-click online deployment and to verify algorithms by closed-loop simulation. U2UData+ also introduces an ELH task for wildlife conservation and provides comprehensive benchmarks with 9 SOTA models.
Background Fades, Foreground Leads: Curriculum-Guided Background Pruning for Efficient Foreground-Centric Collaborative Perception
Wu, Yuheng, Gao, Xiangbo, Tau, Quang, Tu, Zhengzhong, Lee, Dongman
Collaborative perception enhances the reliability and spatial coverage of autonomous vehicles by sharing complementary information across vehicles, offering a promising solution to long-tail scenarios that challenge single-vehicle perception. However, the bandwidth constraints of vehicular networks make transmitting the entire feature map impractical. Recent methods, therefore, adopt a foreground-centric paradigm, transmitting only predicted foreground-region features while discarding the background, which encodes essential context. We propose FadeLead, a foreground-centric framework that overcomes this limitation by learning to encapsulate background context into compact foreground features during training. At the core of our design is a curricular learning strategy that leverages background cues early on but progressively prunes them away, forcing the model to internalize context into foreground representations without transmitting background itself. Extensive experiments on both simulated and real-world benchmarks show that FadeLead outperforms prior methods under different bandwidth settings, underscoring the effectiveness of context-enriched foreground sharing.
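The curricular idea above (use background cues early, fade them out until only foreground is shared) can be sketched as a simple decay schedule; the polynomial shape and all names are assumptions for illustration, not FadeLead's actual training recipe:

```python
def background_keep_ratio(epoch, total_epochs, power=2.0):
    """Illustrative curriculum schedule (shape and names are assumptions):
    start by keeping all background features and polynomially decay to
    zero, so late in training only foreground features are shared."""
    t = min(epoch / max(total_epochs - 1, 1), 1.0)
    return (1.0 - t) ** power

def select_features(foreground, background, epoch, total_epochs):
    # Always transmit foreground; transmit a shrinking slice of background.
    k = round(background_keep_ratio(epoch, total_epochs) * len(background))
    return foreground + background[:k]

fg = ["fg0", "fg1"]
bg = ["bg0", "bg1", "bg2", "bg3"]
print(len(select_features(fg, bg, epoch=0, total_epochs=5)))  # 6: everything kept
print(len(select_features(fg, bg, epoch=4, total_epochs=5)))  # 2: background fully pruned
```

The point of the schedule is that the loss still sees background context while the ratio is high, so the encoder is pushed to fold that context into the foreground features that remain transmittable at deployment time.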
SafeCoop: Unravelling Full Stack Safety in Agentic Collaborative Driving
Gao, Xiangbo, Lin, Tzu-Hsiang, Song, Ruojing, Wu, Yuheng, Huang, Kuan-Ru, Jin, Zicheng, Lin, Fangzhou, Liu, Shinan, Tu, Zhengzhong
Collaborative driving systems leverage vehicle-to-everything (V2X) communication across multiple agents to enhance driving safety and efficiency. Traditional V2X systems take raw sensor data, neural features, or perception results as communication media, which face persistent challenges, including high bandwidth demands, semantic loss, and interoperability issues. Recent advances investigate natural language as a promising medium, which can provide semantic richness, decision-level reasoning, and human-machine interoperability at significantly lower bandwidth. Despite great promise, this paradigm shift also introduces new vulnerabilities within language communication, including message loss, hallucinations, semantic manipulation, and adversarial attacks. In this work, we present the first systematic study of full-stack safety and security issues in natural-language-based collaborative driving. Specifically, we develop a comprehensive taxonomy of attack strategies, including connection disruption, relay/replay interference, content spoofing, and multi-connection forgery. To mitigate these risks, we introduce an agentic defense pipeline, which we call SafeCoop, that integrates a semantic firewall, language-perception consistency checks, and multi-source consensus, enabled by an agentic transformation function for cross-frame spatial alignment. We systematically evaluate SafeCoop in closed-loop CARLA simulation across 32 critical scenarios, achieving 69.15% driving score improvement under malicious attacks and up to 67.32% F1 score for malicious detection. This study provides guidance for advancing research on safe, secure, and trustworthy language-driven collaboration in transportation systems. Our project page is https://xiangbogaobarry.github.io/SafeCoop.
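The multi-source consensus component of the defense pipeline can be sketched as a majority vote over per-agent claims; this omits SafeCoop's semantic firewall, language-perception consistency checks, and agentic spatial alignment, and all names below are illustrative:

```python
from collections import Counter

def consensus(claims, threshold=0.5):
    """Toy multi-source consensus (illustrative only): accept a reported
    hazard only if more than `threshold` of the connected agents report
    it, so a single spoofed or forged message cannot inject a hazard."""
    votes = Counter()
    for agent_claims in claims.values():
        votes.update(set(agent_claims))  # one vote per agent per claim
    quorum = threshold * len(claims)
    return {claim for claim, n in votes.items() if n > quorum}

# One compromised V2X peer spoofs a phantom obstacle (content spoofing).
reports = {
    "ego":   ["pedestrian@12m"],
    "v2x_1": ["pedestrian@12m", "phantom_truck@5m"],  # spoofed message
    "v2x_2": ["pedestrian@12m"],
}
print(consensus(reports))  # {'pedestrian@12m'}: the spoofed claim is rejected
```

Consensus alone cannot stop colluding attackers (the multi-connection forgery case above), which is why the full pipeline layers it with per-message semantic and perceptual checks.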